A two-level multithreaded Delaunay kernel
Author
Abstract
This paper presents a fine-grained parallel version of the 3D Delaunay kernel procedure using the OpenMP (Open Multi-Processing) API. A set S = {p1, ..., pn} of n points is taken as input. S is first sorted along a space-filling curve so that two points that are close in the insertion order are also close geometrically. The sorted set of points is then divided into M subsets S_i, 1 ≤ i ≤ M, of equal size n/M. The multithreaded version of the Delaunay kernel inserts M points at a time into the triangulation. OpenMP barriers provide the synchronization required after each multiple insertion to avoid data races. This simple approach exhibits two standard problems of parallel computing: load imbalance and parallel overheads. Both issues are addressed using a two-level version of the multithreaded Delaunay kernel. Tests show that triangulations of about a billion tetrahedra can be generated on a 32-core machine (Intel Xeon E5-4610 v2 @ 2.30GHz with 128 GB of memory) in less than 3 minutes of wall-clock time, with a speedup of 18 compared to the single-threaded implementation. © 2015 The Authors. Published by Elsevier Ltd. Peer-review under responsibility of organizing committee of the 24th International Meshing Roundtable (IMR24).
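The partitioning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it substitutes a Morton (Z-order) key for the unspecified space-filling curve, assumes coordinates in [0, 1), and assumes that each multiple insertion takes one point from each subset. The names `morton_key`, `partition_sorted`, and `insertion_rounds` are hypothetical, and the Delaunay insertion itself is omitted.

```python
from typing import Iterator, List, Sequence, Tuple

Point = Tuple[float, float, float]


def morton_key(p: Point, bits: int = 10) -> int:
    """Z-order key: interleave the bits of the quantized coordinates.

    Assumption: every coordinate lies in [0, 1). The paper only says
    "space-filling curve"; Morton order is used here for brevity.
    """
    scale = 1 << bits
    q = [min(int(c * scale), scale - 1) for c in p]
    key = 0
    for b in range(bits):
        for axis in range(3):
            key |= ((q[axis] >> b) & 1) << (3 * b + axis)
    return key


def partition_sorted(points: Sequence[Point], m: int) -> List[List[Point]]:
    """Sort the points along the curve, then cut the sorted list into
    m contiguous subsets whose sizes differ by at most one."""
    ordered = sorted(points, key=morton_key)
    n = len(ordered)
    bounds = [i * n // m for i in range(m + 1)]
    return [ordered[bounds[i]:bounds[i + 1]] for i in range(m)]


def insertion_rounds(subsets: Sequence[Sequence[Point]]) -> Iterator[List[Point]]:
    """Yield one batch per round: the k-th point of every subset.

    In a multithreaded kernel each point of a batch would be inserted
    by a different thread, with a barrier after the batch (assumed here).
    """
    longest = max((len(s) for s in subsets), default=0)
    for k in range(longest):
        yield [s[k] for s in subsets if k < len(s)]


if __name__ == "__main__":
    pts = [(x / 4 + 0.1, y / 4 + 0.1, z / 4 + 0.1)
           for x in range(4) for y in range(4) for z in range(4)]
    subsets = partition_sorted(pts, 4)
    for batch in insertion_rounds(subsets):
        pass  # a real kernel would insert the batch concurrently here
```

Because each subset is a contiguous run of the sorted order, consecutive points handled by one thread stay close to each other geometrically, while the points of a single batch come from different parts of the curve.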
Similar resources
The Design and Construction of a User-Level Kernel for Teaching Multithreaded
Multithreading is a powerful programming paradigm that has become very popular in recent years. The authors have developed a set of course materials and software tools for effectively teaching multithreaded programming (MTP). One important component of the authors' system is a very simple user-level kernel for instructors to teach MTP without getting into system details, and for the students...
Algorithm, software, and hardware optimizations for Delaunay mesh generation on simultaneous multithreaded architectures
This article focuses on the optimization of PCDM, a parallel, two-dimensional (2D) Delaunay mesh generation application, and its interaction with parallel architectures based on simultaneous multithreading (SMT) processors. We first present the step-by-step effect of a series of optimizations on performance. These optimizations improve the performance of PCDM by up to a factor of six. They targ...
Communication and Synchronization in Multithreaded Reconfigurable Computing Systems
This paper describes an approach to provide communication and synchronization services to hardware threads being executed on reconfigurable devices under the control of a software-based operating system. This work aims at enabling hardware circuits to be modeled as active, independently executing threads with access to all operating system services, instead of passive coprocessors that can simp...
The Design and Construction of a User-Level Kernel
Multithreading is a powerful programming paradigm that has become very popular in recent years. The authors have developed a set of course materials and software tools for effectively teaching multithreaded programming (MTP). One important component of the authors' system is a very simple user-level kernel for instructors to teach MTP without getting into system details, and for the students...
Using Kernel Coupling to Improve the Performance of Multithreaded Applications
Kernel coupling refers to the effect that kernel i has on kernel j in relation to running each kernel in isolation. The two kernels can correspond to adjacent kernels or a chain of three or more kernels in the control flow of an application. In previous work, we used kernel coupling to provide insights on where further algorithm and code implementation work was needed to improve performance, in...
Journal title:
- Computer-Aided Design
Volume 85, Issue
Pages -
Publication date: 2017